ABSTRACT

We have constructed an extended halo model (EHM) which relates the total stellar mass and star-formation rate (SFR) to halo mass (Mh). An empirical relation between the distribution functions of total stellar mass of galaxies and host halo mass, tuned to match the spatial density of galaxies over 0 < z < 2 and the clustering properties at z ~ 0, is extended to include two different scenarios describing the variation of SFR with Mh. We also present new measurements of the redshift evolution of the average SFR for star-forming galaxies of different stellar mass up to z ~ 2, using data from the Herschel Multi-tiered Extragalactic Survey (HerMES) for infrared-bright galaxies. Combining the EHM with the halo accretion histories from numerical simulations, we trace the stellar mass growth and star-formation history in halos spanning a range of masses. We find that: (1) The intensity of the star-forming activity in halos in the probed mass range has steadily decreased from z ~ 2 to 0; (2) At a given epoch, halos in the mass range between a few times 10^11 M⊙ and a few times 10^12 M⊙ are the most efficient at hosting star formation; (3) The peak of SFR density shifts to lower mass halos over time; (4) Galaxies that are forming stars most actively at z ~ 2 evolve into quiescent galaxies in today's group environments, strongly supporting previous claims that the most powerful starbursts at z ~ 2 are progenitors of today's elliptical galaxies.

Key words: (cosmology:) large-scale structure of Universe - infrared: galaxies - 
methods: statistical - submillimetre - cosmology: observations. 
1 INTRODUCTION

In the past 20 years or so, impressive progress has been made in characterising the evolution of galaxy physical properties over a large fraction of cosmic time. A consistent picture, at least crudely, has emerged in which the global stellar mass density decreases by a factor of 2 or so from z ~ 0 to 2 and the comoving cosmic star-formation rate (SFR) density increases by more than a factor of 10 with lookback time over the past 8 Gyr, peaks around z ~ 2 to 3 and then declines almost linearly with time to higher redshift (e.g., Lilly et al. 1996; Madau et al. 1998; Rudnick et al. 2003; Dickinson et al. 2003; Schiminovich et al. 2005; Hopkins & Beacom 2006; Arnouts et al. 2007; Pascale et al. 2009; Bouwens et al. 2010). The key question that dominates both observational and theoretical efforts today is what physical processes play the dominant role in driving the evolution of the cosmic star-formation activity. Processes such as a decline in the major-merger rate, reduced gas accretion in halos, feedback from central massive black holes and supernovae, and environmental effects (ram pressure stripping of gas, strangulation of the extended halo, etc.) can all affect the star-formation activity (e.g., Keres et al. 2005, 2009; Bell et al. 2005; Bower et al. 2006; Croton et al. 2006; Somerville et al. 2008; Lotz et al. 2008).

Since galaxies form in dark matter halos and their evolution is influenced by the accretion and successive merging of halos (White & Rees 1978; Fall & Efstathiou 1980; Blumenthal et al. 1984), it is reasonable to assume that the physical properties of galaxies should correlate with those of the host halos (such as the mass of the halo). Observationally, the mass of the host halo can be measured in a number of ways, e.g., weak gravitational lensing (McKay et al. 2001; Hoekstra et al. 2004; Sheldon et al. 2004; Mandelbaum et al. 2006; Sheldon et al. 2009), dynamical measurement of satellite galaxies (McKay et al. 2002; van den Bosch et al. 2004; Conroy et al. 2007; More et al. 2011) and X-ray studies (Lin et al. 2003; Lin & Mohr 2004; Vikhlinin et al. 2006). These techniques are at present expensive in terms of observing time and limited to low z and a small dynamic range in halo mass.

Alternatively, the halo model provides a simple but powerful way to statistically link galaxies with halos. In its simplest form, the halo occupation distribution (HOD), which gives the probability of finding N galaxies (with some specified properties) in a halo of mass Mh, is used to interpret galaxy clustering (e.g., Peacock & Smith 2000; Seljak 2000; Scoccimarro et al. 2001; Berlind & Weinberg 2002; Zehavi et al. 2004). Modifications of the HOD include the conditional luminosity function (CLF), which encodes the number of galaxies as a function of luminosity in a given halo (Yang et al. 2003; van den Bosch et al. 2003; Vale & Ostriker 2004), and the conditional stellar mass function (CSMF), which encodes the number of galaxies as a function of stellar mass in a given halo (Yang et al. 2009; Moster et al. 2010; Behroozi et al. 2010).

In this paper, we build an extended halo model (EHM) to connect stellar mass, m*, and SFR, ψ, with the host halo mass, Mh. The EHM is a hybrid model composed of a parametrised m* - Mh relation and a non-parametric m* - ψ relation. The first part of the EHM uses a parametrised relation between the distribution of stellar mass and halo mass, i.e. the CSMF, to describe the statistical relation between m* and Mh. The parameters in the CSMF at z ~ 0 are tuned by the spatial density and clustering of galaxies, while their evolution in the redshift range 0 < z < 2 is constrained by galaxy SMFs only. The second part of the EHM extends the CSMF to the joint distribution in m* and ψ as a function of Mh, using two different scenarios for the role of Mh in determining the distribution of ψ at fixed m*. This second part is non-parametric as we use the observed conditional SFR distributions at fixed m* as direct inputs. The key to building the EHM is a large sample of galaxies with reliable m* and ψ estimates. The Herschel Multi-tiered Extragalactic Survey (HerMES; Oliver et al. 2011), covering most of the well-studied extragalactic fields with ancillary data from the X-ray to radio, is an ideal starting point for such a project.

The layout of the paper is as follows. In Section 2, we first briefly describe the published measurements used to constrain the EHM. Then, we describe the data-sets used to derive the stellar masses and SFRs of high-redshift galaxies in HerMES fields. In Section 3, we present the CSMF in both the local Universe and at high redshift. The evolution of the stellar content as a function of Mh is derived using the CSMF as a function of redshift and the halo accretion history from N-body simulations. In Section 4, we extend the CSMF to the 2-D distribution of galaxies in the (ψ, m*) plane as a function of Mh. The evolution of the star-formation activity as a function of Mh is derived using the EHM as a function of redshift and the halo accretion history. Finally, conclusions and discussions are presented in Section 5. Unless otherwise stated, we assume Ω_m = 0.3, Ω_Λ = 0.7, σ_8 = 0.8 and h = 0.7. All magnitudes are in the AB system.



2 DATA-SETS 

To constrain the m* - Mh relation at z ~ 0, we use the published stellar mass function (SMF) (Guo et al. 2010) and correlation functions of the SDSS galaxies (Li et al. 2006). To constrain the redshift evolution of the m* - Mh relation, we use the published SMFs in Perez-Gonzalez et al. (2008) based on a combined sample of 3.6 and 4.5 μm selected sources in the HDF-N, the CDF-S and the Lockman Hole. The clustering properties of high-redshift galaxies are not used to constrain the evolution parameters in the m* - Mh relation due to issues explained in Section 3.2.

To extend the CSMF to the joint distribution in m* and ψ as a function of Mh, we use the conditional probability distribution function (PDF) of SFR of the entire population as a function of m*. The conditional PDF of SFR of galaxies in the local Universe is taken from Salim et al. (2007). To derive the conditional SFR distribution as a function of m* in the distant Universe, we use galaxies observed in three well-studied extragalactic fields, the Extended Chandra Deep Field-South (ECDFS) field, the COSMOS field and the Extended Groth Strip (EGS). We describe in detail the data-sets used in each field below.



2.1 COSMOS 

Figure 1. The conditional PDF of SFR as a function of stellar mass, in six redshift bins (from left to right and top to bottom, z = [0.2, 0.5], [0.5, 0.8], [0.8, 1.0], [1.0, 1.3], [1.3, 1.6] and [1.6, 2.0]), averaged over COSMOS, ECDFS and EGS. The star-forming sequence can be clearly seen in each panel and it evolves upwards roughly independently of stellar mass.

Figure 2. The redshift evolution of the average SFR as a function of stellar mass for star-forming galaxies and the best-fit power-law in each redshift slice. The open (filled) circles represent values which are derived from samples below (above) the completeness limit in ECDFS, COSMOS and EGS (see Table 1). Errors include the field-to-field variations and photometric redshift uncertainty. In the left panel, both parameters in the power-law are allowed to vary. In the right panel, the power-law slope is fixed to be 0.37, which is the average value over different redshift slices. Note that the power-law fitting is only applied to the filled circles. The best-fit parameters in both panels are listed in Table 2.

Table 1. Stellar-mass-selected samples in COSMOS, EGS and ECDFS. The columns are redshift range and stellar mass m* limit in each field above which our samples are regarded as representative.

z range            COSMOS         ECDFS          EGS
                   log(m*[M⊙])    log(m*[M⊙])    log(m*[M⊙])
z = [0.2, 0.5]     9.8            9.5            9.9
z = [0.5, 0.8]     10.1           9.5            10.0
z = [0.8, 1.0]     10.1           9.5            10.1
z = [1.0, 1.3]     10.2           9.7            10.1
z = [1.3, 1.6]     10.4           9.9            10.3
z = [1.6, 2.0]     10.7           10.1           10.6

The COSMOS photometric redshift catalogue derived from broad and medium bands (GALEX FUV and NUV, optical to infrared data u* B_J V_J g+ r+ i+ i* z+ J K_S K, 14 medium and narrow bands from Subaru and 4 IRAC channels) is described in Ilbert et al. (2009). We use an updated version (v1.8, dated 13 July 2010) of the Ilbert et al. (2009) catalogue. The quality of the photometric redshifts is very high, with 1σ in (1 + z) ~ 0.007 at i+_AB < 22.5. At i+_AB < 24 and z < 1.25, 1σ in (1 + z) ~ 0.012. The deep NIR and IRAC coverage enables the photo-z to be extended to z ~ 2, with 1σ in (1 + z) ~ 0.06 at i+_AB ~ 24. Following Ilbert et al. (2010), we construct a mass-selected sample from the 3.6 μm catalogue of the S-COSMOS survey (Sanders et al. 2007). We cross-match the 3.6 μm and the latest photo-z catalogue by taking the nearest match within 1". The probability of incorrect identification is < 1% (Ilbert et al. 2010). We then select sources with f_3.6 ≥ 5 μJy (the 90% completeness limit), around 2.8% of which are not matched to an optical counterpart. Using the public photo-z catalogue from the NEWFIRM Medium-band Survey (Whitaker et al. 2011), which covers a small area of the COSMOS field but is deeper than Ilbert et al. (2009), we estimate that only 1% (2.5%) of the sources with f_3.6 ≥ 5 μJy lie at z < 1.6 (z < 2.0) or do not have a photo-z estimate. As we are only concerned with the relation between m*, ψ and Mh at z < 2, we will ignore this 1% of 3.6 μm sources in our analysis.
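The nearest-match step described above can be illustrated with a short Python sketch. This is not the pipeline used in this work: the function name `nearest_match`, the brute-force pairwise search and the toy coordinates are all assumptions made for the example, and a catalogue of realistic size would call for a tree-based search instead.

```python
import numpy as np

def nearest_match(ra1, dec1, ra2, dec2, max_sep_arcsec=1.0):
    """For each source in catalogue 1, return the index of the nearest
    source in catalogue 2, or -1 if none lies within max_sep_arcsec.
    Coordinates are in degrees; separations use the haversine formula."""
    ra1, dec1 = np.radians(ra1), np.radians(dec1)
    ra2, dec2 = np.radians(ra2), np.radians(dec2)
    dra = ra1[:, None] - ra2[None, :]
    ddec = dec1[:, None] - dec2[None, :]
    a = (np.sin(ddec / 2.0) ** 2
         + np.cos(dec1)[:, None] * np.cos(dec2)[None, :] * np.sin(dra / 2.0) ** 2)
    sep_arcsec = np.degrees(2.0 * np.arcsin(np.sqrt(a))) * 3600.0
    idx = np.argmin(sep_arcsec, axis=1)
    best = sep_arcsec[np.arange(len(idx)), idx]
    return np.where(best <= max_sep_arcsec, idx, -1)

# Toy catalogues (hypothetical positions, in degrees).
ra_irac = np.array([150.0, 150.1, 151.0])
dec_irac = np.array([2.0, 2.1, 2.5])
ra_opt = np.array([150.1 + 0.5 / 3600.0, 150.0, 149.0])
dec_opt = np.array([2.1, 2.0 + 0.3 / 3600.0, 2.0])

matches = nearest_match(ra_irac, dec_irac, ra_opt, dec_opt, max_sep_arcsec=1.0)
print(matches)  # third source has no counterpart within 1 arcsec
```

The same function applies to the 2" matching radius used for the 24 μm catalogue by changing `max_sep_arcsec`.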



2.2 ECDFS 

We use the Multiwavelength Survey by Yale-Chile (MUSYC) Subaru v1.0 Catalog (Cardamone et al. 2010) containing over 84400 sources. The catalog includes photometry in 32 MUSYC images of the ECDF-S region, including optical to infrared data (UU38BVRIzJHK), 18 medium bands from Subaru and 4 IRAC channels as part of the SIMPLE survey (Damen et al. 2010), for all sources detected in the combined BVR image. Photometric redshifts are determined using the EAZY code (Brammer et al. 2008). The quality of the photometric redshifts is very high, with 1σ = 0.007 in (1 + z) in the range z = [0.1, 1.2], similar to that of the COSMOS field. At z = [1.2, 3.7], the photometric redshift accuracy degrades to 1σ = 0.02 in (1 + z). We select 3.6 μm sources above the completeness limit, which is 1 μJy.



2.3 EGS 

We use a 3.6 + 4.5 μm selected catalogue in the Extended Groth Strip (EGS) containing 28-band photometry from the ultraviolet to the far-infrared (GALEX FUV and NUV, CFHTLS u*g'r'i'z', MMT u'giz, CFHT12k BRI, ACS V_606 i_814, Subaru R, NICMOS J_110 H_160, MOIRCS K_s, CAHA JK_s, WIRC JK and 4 IRAC channels) (Barro et al. 2011a, 2011b). The typical photometric redshift accuracy is 1σ = 0.034 in (1 + z), with a catastrophic outlier fraction of 2%. We apply the 90% completeness limit at 3.6 μm by selecting sources with f_3.6 ≥ 2.3 μJy over areas with homogeneous depth, 52.025° ≤ δ ≤ 53.525°. We also mask out regions in the wings of bright stars.



2.4 Deriving stellar mass and SFR from HerMES 
and ancillary data 

We use the Le Phare code (Arnouts et al. 2002; Ilbert et al. 2006) and the Bruzual & Charlot (2003) stellar population synthesis (SPS) models to derive stellar properties such as stellar mass and SFR. We use the same parameters as in Ilbert et al. (2010) to generate the SED templates, e.g., a Chabrier initial mass function (IMF), two different metallicities (solar and sub-solar) and an exponentially declining star-formation history. Dust extinction is applied to the templates using the Calzetti et al. (2000) law.

We cross-match the 3.6 μm catalogue in each field with the 24 μm catalogue by taking the nearest match within 2". The SPIRE¹ fluxes of the 24 μm sources are obtained using a combination of linear inversion and model selection techniques (Roseboom et al. 2010; Roseboom et al. 2012). With SPIRE, we are able to probe the rest-frame far-IR region to constrain the infrared luminosity L_IR (integrated from 8 to 1000 μm). Previous studies extrapolate L_IR from the 24 μm data and the resulting L_IR can be overestimated by a factor of five at z > 1.5 (Papovich et al. 2007; Daddi et al. 2007; Murphy et al. 2009; Nordon et al. 2010; Elbaz et al. 2010). We use the Chary & Elbaz (2001) templates to fit the infrared SEDs of galaxies observed at 24 μm and in at least one SPIRE band to calculate ψ_IR = 1.09 × 10^-10 × L_IR (Kennicutt 1998). For galaxies not observed in any SPIRE band (around 70% of the 3.6 μm-selected sample), we use ψ_SED derived from SED fitting to the UV to MIR photometric data.
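The choice between the two SFR estimators can be summarised in a few lines of Python. This is a minimal sketch rather than the code used here: the function names are hypothetical, and the units (L_IR in solar luminosities, SFR in solar masses per year) are assumed, following the usual convention for this calibration.

```python
def sfr_from_lir(l_ir_lsun):
    """Infrared-based SFR, psi_IR = 1.09e-10 * L_IR.
    Assumes L_IR in L_sun and returns SFR in M_sun / yr."""
    return 1.09e-10 * l_ir_lsun

def adopted_sfr(l_ir_lsun, sfr_sed):
    """Use psi_IR when an L_IR estimate exists (24 micron plus at least one
    SPIRE band); otherwise fall back on the SED-fitting value psi_SED."""
    if l_ir_lsun is not None:
        return sfr_from_lir(l_ir_lsun)
    return sfr_sed

# A galaxy with L_IR = 1e11 L_sun, and one without any SPIRE coverage.
print(adopted_sfr(1e11, sfr_sed=4.0))
print(adopted_sfr(None, sfr_sed=4.0))
```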

In each field, we generate 10 Monte Carlo realisations of the original photo-z catalogue using the redshift PDF of each galaxy and repeat the stellar mass and SFR calculation. In Fig. 1 we plot the conditional SFR distributions as a function of m* in 6 redshift bins, z = [0.2, 0.5], [0.5, 0.8], [0.8, 1.0], [1.0, 1.3], [1.3, 1.6] and [1.6, 2.0], averaged over all Monte Carlo realisations in COSMOS, ECDFS and EGS. The star-forming sequence² can be clearly seen and it evolves upwards roughly independently of m*. The number of quiescent massive galaxies gradually builds up as redshift decreases. In each redshift bin, the conditional SFR distribution in a given stellar mass bin can be modelled as the sum of two Gaussian distributions which represent the star-forming and passive populations,

$$\Phi(\psi|m_*) = \Phi_{\mathrm{star\text{-}forming}}(\psi|m_*) + \Phi_{\mathrm{passive}}(\psi|m_*). \qquad (1)$$

In this paper, we define star-forming galaxies as those with SFR ≥ ⟨ψ⟩_star-forming − 2σ_star-forming, where ⟨ψ⟩_star-forming and σ_star-forming are the mean SFR and standard deviation of the star-forming population respectively. The advantage of our definition of star-forming galaxies is that it naturally takes into account the fact that the SFR of a star-forming


1 The Spectral and Photometric Imaging Receiver (SPIRE; Griffin et al. 2010) is one of three scientific instruments on board Herschel (Pilbratt et al. 2010). It operates in three wavelength bands centred at 250, 350 and 500 μm.

2 For star-forming galaxies, there exists a strong correlation between stellar mass m* and SFR ψ (with an estimated intrinsic scatter ~ 0.3 dex) from z ~ 0 to 3 (e.g., Elbaz et al. 2007; Daddi et al. 2007; Noeske et al. 2007; Rodighiero et al. 2010; Karim et al. 2011).
Figure 3. Left: The measured SMF of the local Universe based on SDSS DR7 (Guo et al. 2010) compared with the SMF derived from our best-fit CSMF. Central galaxies (the red dot-dashed line) dominate the SMF over the entire mass range probed. Right: The measured projected correlation functions of the SDSS galaxies in different stellar mass bins (Li et al. 2006) compared with the correlation functions derived from our best-fit CSMF. Note that the stellar mass bins shown in each panel are calculated with h = 0.7. For example, the top left panel shows the projected correlation function of galaxies in the stellar mass bin log10 m* (M⊙) = [9.0, 9.5] assuming h = 0.7.

Table 2. A two-parameter fit of the form ψ (M⊙/yr) = α × (m*/M⊙)^β to the stellar mass dependence of the average SFR for star-forming galaxies, averaged over ECDFS, COSMOS and EGS. Entries lost from the source are shown as "…".

                   Both parameters free                  Slope fixed
z range            log α           β             χ²/dof  log α           β      χ²/dof
z = [0.2, 0.5]     …               …             0.17    -3.14 ± 0.46    0.37   0.20
z = [0.5, 0.8]     -1.41 ± 1.92    0.25 ± 0.18   0.18    -2.73 ± 0.37    0.37   0.26
z = [0.8, 1.0]     -1.52 ± 2.04    0.28 ± 0.19   0.30    -2.54 ± 0.37    0.37   0.34
z = [1.0, 1.3]     -3.52 ± 2.66    …             …       …               0.37   0.14
z = [1.3, 1.6]     …               …             0.05    …               0.37   0.14
z = [1.6, 2.0]     -1.86 ± 2.65    0.39 ± 0.24   0.03    -1.62 ± 0.35    0.37   0.03



galaxy increases with increasing stellar mass and increasing redshift (as shown in Fig. 1). In Fig. 2 we plot the redshift evolution of the average SFR as a function of stellar mass for star-forming galaxies and the best-fit power-law to points above the stellar mass completeness limit in each redshift slice (see Table 1). The best-fit parameters in the power-law fitting of the m* - ψ relation are listed in Table 2.
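The star-forming selection and the power-law fit can be sketched as follows. This is an illustration under stated assumptions, not the fitting code behind Table 2: the function names are hypothetical, the cut is applied in whichever variable the Gaussians were fitted (taken here to be log SFR), the fit is an unweighted least-squares fit in log-log space (the measurement errors are ignored), and the input points are noiseless synthetic values.

```python
import numpy as np

def star_forming_mask(log_sfr, mean_sf, sigma_sf):
    """Select galaxies with SFR >= <psi>_sf - 2 sigma_sf, where mean_sf and
    sigma_sf describe the star-forming Gaussian (assumed fitted in log SFR)."""
    return log_sfr >= mean_sf - 2.0 * sigma_sf

def fit_power_law(mstar, mean_sfr, fixed_beta=None):
    """Fit psi = alpha * mstar**beta in log-log space.
    Returns (log10 alpha, beta). If fixed_beta is given, only alpha is fitted."""
    x, y = np.log10(mstar), np.log10(mean_sfr)
    if fixed_beta is None:
        beta, log_alpha = np.polyfit(x, y, 1)  # slope, intercept
    else:
        beta = fixed_beta
        log_alpha = np.mean(y - beta * x)      # least-squares intercept
    return log_alpha, beta

# Synthetic, noiseless points generated from log10(alpha) = -3.14, beta = 0.37.
mstar = 10.0 ** np.array([10.0, 10.5, 11.0])
mean_sfr = 10.0 ** (-3.14) * mstar ** 0.37

log_alpha_free, beta_free = fit_power_law(mstar, mean_sfr)
log_alpha_fix, beta_fix = fit_power_law(mstar, mean_sfr, fixed_beta=0.37)
print(log_alpha_free, beta_free, log_alpha_fix, beta_fix)
```

Fixing the slope removes the strong degeneracy between log α and β, which is why the uncertainties on log α in the fixed-slope columns of Table 2 are so much smaller.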



3 EHM: 1. CONNECTING STELLAR MASS 
WITH HALO MASS 

3.1 The stellar-to-halo mass relation at z ~ 0

Figure 4. The predicted average total stellar mass as a function of Mh derived from the best-fit CSMF of the local Universe. The predicted average stellar mass of central galaxies agrees reasonably well with results from galaxy-galaxy lensing (Mandelbaum et al. 2006; Hoekstra 2007), satellite kinematics (Conroy et al. 2007; More et al. 2011), and other empirical models of the stellar-to-halo mass relation (Behroozi et al. 2010; Moster et al. 2010). The vertical bar on the left indicates the typical error in m*. The vertical dotted line marks the characteristic Mh in the m* - Mh relation for central galaxies.

We choose the CSMF, Φ(m*|Mh), which specifies the number of galaxies of stellar mass m* that reside in a halo of mass Mh, to describe the stellar-to-halo mass relation. Details of the parametrisation of the CSMF and the fitting process to the observed spatial density and clustering of galaxies can be found in Appendix A. The left panel in Fig. 3 compares the measured SMF of the local Universe (Guo et al. 2010) with the best-fit SMF from our CSMF. Note that in comparing to the observed SMF, the predicted SMF from the CSMF (the black dashed line in the left panel of Fig. 3) has been convolved with a log-normal distribution with its width set to 0.1 dex (Li & White 2009) to account for statistical errors in the observational estimate of stellar mass. It is clear that central galaxies dominate the SMF over the entire mass range probed. At m* > 10^8 M⊙, satellite galaxies make up 18% of the entire population. The fraction of satellite galaxies decreases rapidly with increasing stellar mass at the high mass end. At m* > 5 × 10^10 M⊙, satellite galaxies account for 8% of the entire population, while at m* > 10^11 M⊙ the fraction of satellites is < 1%. The projected correlation functions in five stellar mass bins (Li et al. 2006) are compared with the best-fit from our CSMF in the right panel in Fig. 3. The 1-halo term (due to galaxy pairs residing in the same halos) dominates the clustering signal on small scales and the 2-halo term (due to galaxy pairs in separate halos) dominates the clustering signal on large scales. The transition between the 1-halo and 2-halo terms is at ~ 1 h^-1 Mpc in all stellar mass bins. The large-scale 2-halo term (proportional to the linear bias factor) increases with m*, indicating that more massive galaxies reside in more massive halos. In the two lowest stellar mass bins, the predicted clustering signal lies below the measured signal on the smallest scales (< 0.2 h^-1 Mpc). This cannot be due to our particular choice of the galaxy density profile inside a dark matter halo, because we do not see the same effect in the three more massive bins. A full investigation of the cause is deferred until the full covariance matrix of the correlation function is available.
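The convolution of the intrinsic SMF with a log-normal error distribution, mentioned above, amounts to a Gaussian smoothing in log m*. The Python sketch below illustrates the operation on a toy, steeply declining SMF; the function name, the uniform grid, the 5σ kernel truncation and the toy input are assumptions for this example and do not reproduce the fitted model.

```python
import numpy as np

def convolve_lognormal(log_m, phi, sigma_dex):
    """Convolve phi(log m), sampled on a uniform grid in log m, with a
    Gaussian of width sigma_dex (i.e. a log-normal error in m)."""
    dlogm = log_m[1] - log_m[0]
    half = int(np.ceil(5.0 * sigma_dex / dlogm))   # truncate kernel at ~5 sigma
    offsets = np.arange(-half, half + 1) * dlogm
    kernel = np.exp(-0.5 * (offsets / sigma_dex) ** 2)
    kernel /= kernel.sum()                         # conserve number density
    return np.convolve(phi, kernel, mode="same")   # zero-padded at the edges

# Toy intrinsic SMF falling as m**-2 (per dex); values near the grid edges
# are affected by the zero padding and should be discarded.
log_m = np.linspace(8.0, 12.0, 201)
phi_intrinsic = 10.0 ** (-2.0 * (log_m - 10.0))
phi_observed = convolve_lognormal(log_m, phi_intrinsic, sigma_dex=0.1)

boost = phi_observed[100] / phi_intrinsic[100]
print(boost)  # > 1: scatter moves more galaxies up than down on a steep SMF
```

Because the SMF declines steeply at the massive end, the scatter inflates the observed abundance there, which is why the correction matters most for the highest stellar masses.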

With the parameters in the CSMF tuned by the galaxy abundance and clustering data, the average total stellar mass as a function of halo mass can be calculated from the best-fit CSMF,

$$\langle m_* \rangle_{\mathrm{total}} = \int m_* \, \Phi(m_*|M_h) \, dm_* = \int m_* \left[ \Phi_{\mathrm{cen}}(m_*|M_h) + \Phi_{\mathrm{sat}}(m_*|M_h) \right] dm_* = \langle m_* \rangle_{\mathrm{cen}} + \langle m_* \rangle_{\mathrm{sat}}, \qquad (2)$$
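Equation (2) is a first moment of the CSMF and can be evaluated numerically on a grid in log m*. The sketch below is illustrative only: the function names are hypothetical, the CSMF is taken per unit log10 m* at one fixed Mh, and the Gaussian shapes used for the central and satellite terms are toy inputs, not the parametrisation of Appendix A.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integration of y(x) on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def mean_stellar_mass(log_m, phi_cen, phi_sat):
    """Average stellar mass per halo from a CSMF given per unit log10 m*.
    Returns (total, central, satellite) contributions, as in equation (2)."""
    m = 10.0 ** log_m
    m_cen = trapezoid(m * phi_cen, log_m)
    m_sat = trapezoid(m * phi_sat, log_m)
    return m_cen + m_sat, m_cen, m_sat

def gaussian(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)

# Toy CSMF at a single halo mass: one central (log-normal, median 1e10 M_sun,
# 0.15 dex scatter) and on average 0.5 satellites around 1e9 M_sun.
log_m = np.linspace(7.0, 13.0, 6001)
phi_cen = gaussian(log_m, 10.0, 0.15)
phi_sat = 0.5 * gaussian(log_m, 9.0, 0.30)

total, cen, sat = mean_stellar_mass(log_m, phi_cen, phi_sat)
print(total, cen, sat)
```

For a log-normal central term the mean exceeds the median by a factor exp[(σ ln 10)²/2], which gives a convenient analytic check of the integration.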

which is plotted in Fig. 4. The average stellar mass of the central galaxies as a function of halo mass from our CSMF model agrees reasonably well with constraints from galaxy-galaxy lensing (Mandelbaum et al. 2006; Hoekstra 2007), satellite dynamics (Conroy et al. 2007; More et al. 2011) and galaxy group catalogues (Yang et al. 2009). Our result on the stellar-to-halo mass relation for central galaxies also agrees well with other empirical models, i.e. Moster et al. (2010) and Behroozi et al. (2010)³. Both Moster et al. (2010) and Behroozi et al. (2010) fit to the observed SMF only. The good agreement between different empirical models indicates that an accurate SMF is the most important constraint in determining the statistical relation between m* and Mh. In our CSMF model, the average m* of the central galaxies grows roughly as M^' 16 at the low-mass end and as M^' 71 at the high-mass end. The characteristic halo mass for central galaxies in our model, which is where the low- and high-mass power-laws meet, is 5 × 10^11 M⊙. The corresponding stellar mass at the characteristic halo mass is ~ 10^10 M⊙, which is where local galaxies are found to divide into two distinct families, with less massive galaxies showing younger stellar populations, optically blue colours and disk-like morphologies, and more massive galaxies exhibiting older stellar populations, optically red colours and more bulge-like morphologies (Kauffmann et al. 2003). Therefore, the different stellar mass build-up history, indicated by the different m* - Mh relation below and above Mh = 5 × 10^11 M⊙, may explain the observed division in galaxy properties below and above m* ~ 10^10 M⊙.

3 Moster et al. (2010) uses an almost identical CSMF formalism to what is used in this paper. Essentially, it is a double power-law connected at some characteristic mass scale. Behroozi et al. (2010) uses a different CSMF formalisation. The main difference is that for high-mass galaxies, the m* - Mh relation asymptotes to a sub-exponential function instead of a power-law.

Table 3. Volume-limited and stellar-mass-selected subsamples in COSMOS and EGS used to calculate correlation functions. The columns are sample name, redshift range, number of galaxies and stellar mass range. Note that the number of galaxies in each sample varies slightly in different Monte Carlo realisations.

Sample             z range            Ngal    log10 m* (M⊙)
z1M1 (COSMOS)      z = [0.2, 0.5]     2117    [9.8, 10.1]
z1M2 (COSMOS)      z = [0.2, 0.5]     2025    [10.1, 10.4]
z1M3 (COSMOS)      z = [0.2, 0.5]     2175    > 10.4
z2M1 (COSMOS)      z = [0.5, 0.8]     2311    [10.1, 10.4]
z2M2 (COSMOS)      z = [0.5, 0.8]     2641    [10.4, 10.6]
z2M3 (COSMOS)      z = [0.5, 0.8]     2369    > 10.6
z3M1 (COSMOS)      z = [0.8, 1.0]     4821    [10.1, 10.5]
z3M2 (COSMOS)      z = [0.8, 1.0]     5051    > 10.5
z4M1 (COSMOS)      z = [1.0, 1.3]     4111    [10.2, 10.6]
z4M2 (COSMOS)      z = [1.0, 1.3]     3950    > 10.6
z5M1 (COSMOS)      z = [1.3, 1.6]     3867    > 10.4
z6M1 (COSMOS)      z = [1.6, 2.0]     2425    > 10.7
z1M1 (EGS)         z = [0.2, 0.5]     2064    > 9.9
z2M1 (EGS)         z = [0.5, 0.8]     2186    [10.0, 10.2]
z2M2 (EGS)         z = [0.5, 0.8]     2045    > 10.2
z3M1 (EGS)         z = [0.8, 1.0]     2650    > 10.1
z4M1 (EGS)         z = [1.0, 1.3]     2965    > 10.1
z5M1 (EGS)         z = [1.3, 1.6]     2333    > 10.3
z6M1 (EGS)         z = [1.6, 2.0]     1731    > 10.6



3.2 The stellar-to-halo mass relation at high z 

In Table 3, we list a series of volume- and stellar-mass-limited subsamples in six redshift bins in COSMOS and EGS. The projected correlation function for each subsample in COSMOS and EGS is plotted in Fig. 5. More details on how projected correlation functions are calculated can be found in Appendix B. In redshift bins where multiple stellar-mass-limited subsamples exist, more massive galaxies generally appear to show stronger clustering, although the large errors prevent any firm conclusions from being drawn. This is consistent with Meneux et al. (2009), who studied the clustering dependence on m* in the redshift bin z = [0.2, 1] using the first 10K redshifts from the zCOSMOS survey and found a mild dependence on m*, especially on small scales (see Fig. 6).

Figure 5. The projected correlation functions of stellar-mass-limited subsamples listed in Table 3 in six redshift bins z1 = [0.2, 0.5], z2 = [0.5, 0.8], z3 = [0.8, 1.0], z4 = [1.0, 1.3], z5 = [1.3, 1.6] and z6 = [1.6, 2.0]. The solid lines are the predicted correlation function from our best-fit CSMF. Error bars include both the bootstrapping error and the photometric redshift error. In redshift bins where multiple stellar-mass-limited subsamples exist, more massive galaxies seem to show a higher clustering amplitude than less massive galaxies.

We derive the m* - Mh relation for the local Universe by fitting to both the spatial density and clustering of galaxies. At high z, however, we only use the SMFs and not the correlation functions presented above. This is because the correlation function is extremely sensitive to cosmic variance. A large difference in the correlation functions between COSMOS and VVDS was reported in Meneux et al. (2009). Also, the flat shape in the measured zCOSMOS correlation functions (shown in Fig. 6) over the redshift range z = [0.6, 1.0] has been attributed to an overabundance of high-density regions (de la Torre et al. 2010). We show the measured SMFs in Perez-Gonzalez et al. (2008), based on a combined sample of 3.6 and 4.5 μm selected sources in the HDF-N, the CDF-S and the Lockman Hole, and the best-fit from our CSMF model in Fig. 7. Note that in comparing to the observed SMF, the predicted SMF (i.e. the intrinsic SMF) from the best-fit CSMF model (the black dashed line in Fig. 7) has been convolved with a log-normal distribution with its width set to 0.3 dex (Perez-Gonzalez et al. 2008) to account for statistical errors in the observational estimate of stellar mass. The SMF increases over time but mostly in low-mass systems. The contribution from satellites also grows over time. In Fig. 5, the projected correlation functions in COSMOS and EGS are compared with the predicted correlation functions from our best-fit CSMF. There is a relatively good overall agreement between the two. On large scales, the measured correlation function falls below the predicted curve, which is due to the integral constraint. If the galaxy number density fluctuations in the probed volume are smaller than the average over a cosmologically representative volume, then the measured correlation function will be biased low by a constant, which is equal to the fractional variance of the number counts in cells. This effect is significant if the survey field is small. In Fig. 6 we compare the projected correlation functions of stellar-mass-limited samples from the zCOSMOS 10K sample (Meneux et al. 2009) with the predicted clustering from our model and again find relatively good overall agreement.

We plot the average m* - Mh relation for central galaxies in the left panel of Fig. 8. The characteristic halo mass scale in the m* - Mh relation for central galaxies increases with increasing redshift, changing from ~ 5.0 × 10^11 M⊙ at z = 0.1 to ~ 1.1 × 10^12 M⊙ at z = 1.8. In the right panel of Fig. 8 we plot the m*-to-Mh ratio (a measure of the integrated star-formation efficiency) as a function of Mh for central galaxies. It is clear that the integrated star-formation efficiency is low in both low-mass and high-mass halos in all redshift slices at 0 < z < 2. In low-mass halos, star-formation efficiency is suppressed, possibly due to supernova feedback which can re-heat the interstellar medium, heat gas in the dark matter halo or even eject gas altogether (Springel & Hernquist 2003; Brooks et al. 2007; Ceverino & Klypin 2009). In high-mass halos, star-formation efficiency is also suppressed, possibly due to gravitational heating (Khochfar & Ostriker 2008; Dekel & Birnboim 2008) and/or feedback from AGN which transfers energy to the halo gas (Croton et al. 2006; Bower et al. 2006; Monaco et al. 2007). The peak of the average stellar-to-halo mass ratio for central galaxies has shifted towards lower mass halos over time.

In Fig. [5] we plot the stellar mass build-up history as 
a function of halo mass at the present day by evolving Mh 
at a particular redshift to Mh at z = using the halo mass 
accretion rate from Fakhouri et al. (2010), 



/ dM h 



dt 



46.1 



Mh 
10 12 



(l+i.iiz) x/«m(i + z)* + n A .(3) 



So we can trace the evolution of the stellar content in the 
same halo along any vertical line in Fig. [9] It is clear that 
the stellar mass assembly happened much earlier in massive 
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Figure 6. The projected correlation functions of stellar-mass-limited samples (black points: galaxies with log(m*/MQ) ^ 9.0; red 
points: log(m*/M0) ^ 9.5; green points: log(m*/MQ) ^ 10.0; blue points: log(m*/MQ) ^ 10.5) in three redshift bins zl = [0.2,0.5], 
z2 = [0.5,0.8] and z3 = [0.8,1.0] from the zCOSMOS 10K sample (Meneux et al 2009). The solid lines are the predicted correlation 
function from our best-fit CSMF. The flat shape in the measured zCOMOS correlation functions in the middle panel has been explained 
by an overabundance of high-density regions (de la Torre et al. 2010). 
Figure 7. The measured SMFs in different redshift bins from Perez-Gonzalez et al. (2008). The redshift range is indicated in each panel. 
The dashed black line in each panel is the underlying SMF predicted from the best-fit CSMF. The solid black line in each panel is the 
convolution of the dashed black line with a log-normal distribution which represents the statistical error in the observational estimate of 
stellar mass. The blue line is the present-day SMF (i.e. the black dashed line in the left panel of Fig. 3). 



halos than in less massive halos. In halos more massive than 
10^13 M⊙ (the present-day value), the stellar mass of the central 
galaxies has increased by at most a factor of a few. But 
in less massive halos, the stellar mass of the central galaxies 
has grown by an order of magnitude or more. This is con- 
sistent with the downsizing scenario of galaxy formation. 
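
As a minimal sketch of how Eq. (3) can be used to evolve a halo mass to z = 0 (not the code used in this work), the mean accretion rate can be integrated over redshift using dt/dz = -1/[(1 + z)H(z)]. The cosmological parameters, the function names and the assumption that every halo grows at exactly the mean rate are illustrative choices, not taken from the text.

```python
import math

# Assumed flat LCDM cosmology (illustrative values, not taken from the text).
H0_KM_S_MPC = 70.0
OMEGA_M, OMEGA_L = 0.3, 0.7
KM_PER_MPC = 3.0857e19
SEC_PER_YR = 3.15576e7
H0_PER_YR = H0_KM_S_MPC / KM_PER_MPC * SEC_PER_YR  # H0 in yr^-1


def e_of_z(z):
    """Dimensionless Hubble rate E(z) = H(z) / H0."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def mean_accretion_rate(m_h, z):
    """Eq. (3): mean halo mass accretion rate in Msun per yr."""
    return 46.1 * (m_h / 1e12) ** 1.1 * (1.0 + 1.11 * z) * e_of_z(z)


def dm_dz(m_h, z):
    """dM/dz = (dM/dt) (dt/dz), with dt/dz = -1 / [(1 + z) H(z)]."""
    return -mean_accretion_rate(m_h, z) / ((1.0 + z) * H0_PER_YR * e_of_z(z))


def evolve_to_present(m_h, z_start, n_steps=2000):
    """Integrate dM/dz from z_start down to z = 0 with fixed RK4 steps."""
    dz = -z_start / n_steps  # negative: we step towards the present day
    z = z_start
    for _ in range(n_steps):
        k1 = dm_dz(m_h, z)
        k2 = dm_dz(m_h + 0.5 * dz * k1, z + 0.5 * dz)
        k3 = dm_dz(m_h + 0.5 * dz * k2, z + 0.5 * dz)
        k4 = dm_dz(m_h + dz * k3, z + dz)
        m_h += dz * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        z += dz
    return m_h


# Hypothetical example: a 10^12 Msun halo at z = 2 evolved to the present day.
m_today = evolve_to_present(1e12, 2.0)
```

Because E(z) appears in both the accretion rate and H(z), it cancels in dM/dz, so the evolved mass in this sketch depends on the assumed cosmology only through H0.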



4 EHM: 2. CONNECTING STELLAR MASS, 
SFR AND HALO MASS 

Now we can extend the CSMF to the 2-D distribution 
Φ(ψ, m*|Mh), which specifies the number of galaxies as a 
function of m* and ψ at fixed Mh. Using conditional probability 
theory, one can show that 

$$\Phi(\psi, m_*|M_h) = \Phi(m_*|M_h) \times \Phi(\psi|m_*, M_h). \tag{4}$$

If the distribution of SFR is only dependent on m* and at 
most weakly dependent on Mh, then one can assume 

$$\Phi(\psi, m_*|M_h) \approx \Phi(m_*|M_h) \times \Phi(\psi|m_*). \tag{5}$$

We will refer to this simplification as Scenario A. 
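
To illustrate the factorisation in Eq. (5), the sketch below builds the joint distribution on a grid as the product Φ(m*|Mh) × Φ(ψ|m*) and sums ψ over it to obtain the average SFR of the central galaxy. Both ingredients are toy stand-ins: a log-normal central CSMF and a hypothetical log-normal SFR distribution about an assumed power-law "main sequence". The slope, zero point and scatter are assumptions made for this example only, and the halo mass enters only through log m_c, which would in practice come from the fitted m_c(Mh) relation.

```python
import math

LN10 = math.log(10.0)


def gaussian_per_dex(log_x, log_median, sigma_dex):
    """Gaussian in log10(x), i.e. a distribution per dex normalised to unity."""
    u = (log_x - log_median) / sigma_dex
    return math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * sigma_dex)


def log_main_sequence(log_m):
    """Hypothetical median log10(SFR) at fixed stellar mass (assumed form)."""
    return 0.8 * (log_m - 10.0)


def csmf_central(log_m, log_m_c, sigma_c=0.2):
    """Phi(m*|Mh) for the central galaxy, per dex in m* (toy stand-in)."""
    return gaussian_per_dex(log_m, log_m_c, sigma_c)


def sfr_given_mass(log_psi, log_m, sigma_psi=0.3):
    """Phi(psi|m*), per dex in psi, independent of Mh as in Scenario A."""
    return gaussian_per_dex(log_psi, log_main_sequence(log_m), sigma_psi)


def mean_sfr_scenario_a(log_m_c, n=300):
    """Average SFR: sum of psi * Phi(m*|Mh) * Phi(psi|m*) over the grid."""
    lo_m = log_m_c - 2.0
    d_m = 4.0 / n
    total = 0.0
    for i in range(n):
        log_m = lo_m + (i + 0.5) * d_m
        weight_m = csmf_central(log_m, log_m_c) * d_m
        lo_p = log_main_sequence(log_m) - 3.0
        d_p = 6.0 / n
        for j in range(n):
            log_psi = lo_p + (j + 0.5) * d_p
            joint = weight_m * sfr_given_mass(log_psi, log_m) * d_p
            total += 10.0 ** log_psi * joint
    return total
```

For these toy inputs the double sum can be checked against the closed-form mean of a log-normal distribution, which is a convenient test of the grid resolution.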

However, it is important to realise that the distribution 
of SFR at fixed m* may be different in halos of different 
masses, which is a measure of the Mpc-scale environment. 
Using group catalogues constructed from the SDSS DR5 
(Yang et al. 2007), Kimm et al. (2009) studied the fraction of 
passive galaxies, f_passive, as a function of m* and Mh. Within 
the error bars, it is difficult to tell whether f_passive at fixed 
m* has any significant dependence on Mh. However, Peng 
et al. (2010), using both the SDSS and zCOSMOS datasets, 
found that the SFR of star-forming galaxies at fixed m* is 
completely independent of environment (measured by the 
5th nearest neighbour density estimator), but f_passive depends 
on environment even at fixed m*. Therefore, in this 
paper, we adopt a second scenario in building the 2-D distribution 
in the (ψ, m*) plane as a function of halo mass. We 

4 Since the SFR distribution at a given stellar mass is indepen- 
dent of environment for star-forming galaxies, we will only need 
to use Scenario A to connect SFR with halo mass for the star- 
forming galaxy population. 
Figure 8. Left: The predicted average stellar mass of central galaxies as a function of halo mass from the best-fit CSMF. For clarity, 
we only plot errors on a few selected redshift slices and halo mass bins. Different lines are colour-coded by redshift as indicated in the 
panel. The characteristic halo mass scale for central galaxies increases with increasing redshift. Right: The average stellar-to-halo mass 
ratio for the central galaxies versus the host halo mass predicted from the best-fit CSMF. It is clear that the star-formation efficiency is 
low in both low-mass and high-mass halos and the peak in the stellar-to-halo mass ratio shifts to lower mass halos over time. 
Figure 9. Left: The predicted average stellar mass of the central galaxies as a function of halo mass at the present day, showing the 
evolution of the stellar content as a function of halo mass. The halo mass is evolved to z = 0 using the halo mass accretion history from 
Fakhouri et al. (2010). Different lines are colour-coded by redshift as indicated in the panel. Right: The average stellar-to-halo mass ratio versus 
halo mass at the present day. The build-up of stellar mass happened early on in massive halos. 



assume that the fraction of passive galaxies at fixed m* has 
a power-law dependence on Mh, i.e. f_passive(Mh|m*) ∝ Mh^η(m*). 
Furthermore we assume that all galaxies are passive in very 
massive halos (corresponding to the most massive rich clusters), 
i.e. f_passive = 1 at Mh = 10^15 M⊙. Since we know the 
overall f_passive in a given stellar mass bin, we can work out 
the power-law dependence η(m*). Under this assumption, 
the SFR distribution at fixed m* and Mh can be derived 
from the SFR distribution at fixed m* but with f_passive modulated 
by halo mass, i.e. 



$$\Phi(\psi|m_*, M_h) \approx \Phi(\psi|m_*)\, f_{\rm passive}(M_h|m_*). \tag{6}$$



We will refer to this simplification as Scenario B. 
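
The step "we can work out the power-law dependence η(m*)" can be sketched as a one-dimensional root find. The example below assumes f_passive(Mh|m*) = (Mh / 10^15 M⊙)^η, so that f_passive = 1 at 10^15 M⊙, and solves for the η that reproduces a given overall passive fraction in one stellar mass bin. The halo mass grid and the weights (the fraction of galaxies of that stellar mass hosted by each halo mass) are hypothetical inputs, and bisection is simply one convenient solver, not necessarily the one used in this work.

```python
def passive_fraction(m_h, eta):
    """Assumed form: f_passive(Mh|m*) = (Mh / 1e15 Msun)^eta, unity at 1e15 Msun."""
    return (m_h / 1e15) ** eta


def overall_passive_fraction(eta, halo_masses, weights):
    """Passive fraction averaged over host halo mass; weights must sum to 1."""
    return sum(w * passive_fraction(m, eta) for m, w in zip(halo_masses, weights))


def solve_eta(f_overall, halo_masses, weights, tol=1e-12):
    """Bisection for eta >= 0; assumes all halo masses lie below 1e15 Msun."""
    if not 0.0 < f_overall < 1.0:
        raise ValueError("f_overall must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    # The averaged fraction falls monotonically with eta, so expand the
    # bracket until it drops below the target.
    while overall_passive_fraction(hi, halo_masses, weights) > f_overall:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if overall_passive_fraction(mid, halo_masses, weights) > f_overall:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical stellar mass bin: 40 per cent passive overall.
halo_masses = [1e12, 1e13, 1e14]
weights = [0.6, 0.3, 0.1]
eta = solve_eta(0.4, halo_masses, weights)
```

With η in hand, `passive_fraction(m_h, eta)` gives the halo-mass-dependent passive fraction that modulates the SFR distribution in Scenario B.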

In Fig. 10 we plot the average total SFR as a function 
of Mh at various redshifts. The left panel is for all galaxies 



and the right panel is for star-forming galaxies only (as de- 
fined in Section 2.4). Error bars include the uncertainty in 
the parametrised stellar-to-halo mass relation, the field-to- 
field variation in the SFR distribution as a function of stellar 
mass and the photometric redshift error. At z = 0.1, the errors 
on the SFR - Mh relation are very small because the error 
bars only include the uncertainty in the parametrised stellar- 
to-halo mass relation. The average SFR is higher/lower in 
less/more massive halos in Scenario B than in Scenario 
A. This is because Scenario B assumes that f_passive 
increases with increasing Mh. However, the difference in the 
SFR as a function of Mh between the two scenarios is small 
and does not affect the qualitative conclusions drawn in this 
paper. We can see that the intensity of star-forming activity 
in halos in the probed mass range has steadily decreased as 
a function of time, dropping by over one order of magnitude 
from z ~ 2 to z ~ 0. The peak in SFR shifts from Mh just 
over 10^12 M⊙ at z ~ 2 to just below 10^12 M⊙ at z ~ 0.1, 
in qualitative agreement with Fig. 8, where the peak of the 
stellar-to-halo mass ratio (a measure of the integrated star- 
formation efficiency) is shown to shift towards lower mass 
halos over time. At a given redshift, halos in the mass range 
between a few times 10^11 M⊙ and a few times 10^12 M⊙ are the 
most efficient at hosting star formation. Again, this is consistent 
with Fig. 8, which shows that the integrated star-formation 
efficiency is low in both low- and high-mass halos and peaks 
at ~ 10^12 M⊙. 

In Fig. 11, Mh is evolved to z = 0 using the halo 
mass accretion history derived from numerical simulations 
(Fakhouri et al. 2010), i.e. Eq. (3). So we can trace the star- 
formation history in the same halo along any vertical line. 
One can read off the evolutionary sequence of different pop- 
ulations of galaxies. Galaxies that are forming stars most 
actively at z ~ 2 have evolved into populations that reside 
in group-like environments at the present day, and galaxies 
that are forming stars most actively at the present day 
generally reside in field environments. This explains the reversal 
of the SFR - density relation at high redshift first 
presented in Elbaz et al. (2007) and strongly supports previous 
claims that the most powerful starbursts at z ~ 2 
(i.e. sub-mm galaxies) have evolved into today's elliptical 
galaxies in dense environments (e.g. Lilly et al. 1999; Smail 
et al. 2004; Swinbank et al. 2006). It is worth pointing out 
that our results on the redshift evolution of the average SFR 
as a function of halo mass are in good qualitative agree- 
ment with some recent results in the literature (Behroozi et 
al. 2012; Moster et al. 2013). A detailed quantitative com- 
parison (e.g., the impact of different methodology, different 
observations used to constrain the empirical model etc.) is 
beyond the scope of this paper. 



5 DISCUSSIONS AND CONCLUSIONS 

In the last ten years there has been an explosion of spec- 
troscopic and multi-wavelength photometric data charting 
the star-formation history and stellar mass build-up over a 
large fraction of cosmic time. And now the advent of Her- 
schel allows us to reliably probe the obscured star-formation 
activity in large numbers of high-z galaxies. In the near fu- 
ture, powerful space- and ground-based facilities will dra- 
matically increase sample size and allow robust measure- 
ments of galaxy properties to be made at even higher red- 
shift. 

In this paper, we present an extended halo model 
(EHM) of galaxy evolution which links the stellar mass (m*) 
and SFR of galaxies to their underlying host halo mass (Mh) 
from the local Universe to z ~ 2. While the empirical relation 
between m* and Mh has been constructed based on 
observations before, this is the first time the relation between 
ψ and Mh has been constructed from observational 
data over 80% of cosmic time. The Herschel-SPIRE observations 
obtained as part of the Herschel Multi-tiered Extragalactic 
Survey (HerMES) are crucial for obtaining accurate 
SFR estimates for dusty star-forming galaxies at high z. 

The EHM is built through two steps: 



• First, we build the CSMF Φ(m*|Mh), which specifies 
the average number of galaxies as a function of m* in a 
halo of a given mass. The CSMF, by construction, fits the 
SMF and the projected correlation functions as a function 
of stellar mass in the local Universe and the SMFs in various 
redshift slices in the distant Universe. The predicted clustering 
properties from our best-fit CSMF as a function of 
redshift also agree reasonably well with the measured correlation 
functions at high z (modulo integral constraint and 
cosmic variance effects). 

• Second, we extend the CSMF to the joint distribution 
in ψ and m* as a function of halo mass, Φ(ψ, m*|Mh), by 
incorporating the distribution of SFR at fixed m*. We have 
used two scenarios in building Φ(ψ, m*|Mh). Scenario A 
assumes that m* plays the most important role in determining 
the SFR distribution of galaxies and the effect of 
Mh at fixed m* is negligible. Scenario B assumes that the 
SFR distribution at fixed m* has a power-law dependence on 
Mh. The difference in the resulting ψ - Mh relation is small 
between the two different scenarios and does not affect the 
main conclusions presented in the paper. 

Combining the halo accretion history from numerical 
simulations and the 2-D distribution of m* and ψ as a function 
of Mh in various redshift slices, Φ(ψ, m*|Mh, z), we can 
trace the stellar mass growth and the evolution of SFR in 
different halos. Our most important findings are: 

• The intensity of the star-forming activity in halos in 
the probed mass range has steadily decreased over time, 
dropping by over one order of magnitude from z ~ 2 to 
z~0; 

• At each redshift, halos in the mass range between a few 
times 10^11 M⊙ and a few times 10^12 M⊙ are the most efficient 
at hosting star formation, consistent with the optimum halo 
mass scale for star formation predicted from numerical simulations; 

• The peak of SFR as well as the peak of the stellar-to- 
halo mass ratio (a measure of the integrated star-formation 
efficiency) shifts to lower mass halos as redshift decreases; 

• Galaxies that are forming stars most actively at z ~ 2 
have evolved into quiescent galaxies in group-like environ- 
ments at the present day. 

To further constrain the physical processes responsible 
for the ψ - Mh relation and its evolution with redshift, future 
work is needed to investigate the role of three main suspects: 
molecular gas content and evolution, feedback from 
central massive black holes and environmental effects on 
star formation. The advent of the Atacama Large Millime- 
ter/submillimeter Array (ALMA), the Expanded Very Large 
Array (EVLA; Perley et al. 2011) and the Northern Ex- 
tended Millimeter Array (NOEMA) means that we are now 
in a position to be able to measure the evolution of the 
molecular gas content in a statistically significant sample of 
galaxies with moderate SFRs. The feedback from growing 
black holes may also impact the star-formation activity in 
massive halos (Mh > 10^13 M⊙), as required in order to reproduce 
the observed stellar mass and luminosity functions 
of galaxies in numerical simulations and semi-analytic mod- 
els (e.g., Bower et al. 2006, 2008; Croton et al. 2006). We 
will extend the EHM to include the empirical relation be- 
tween AGNs and halo mass in a future paper to statistically 
Figure 10. Left: The average SFR as a function of Mh. Error bars include the uncertainty in the parametrised stellar-to-halo mass 
relation, the field-to-field variation in the SFR distribution as a function of stellar mass and the photometric redshift error. At z = 0.1, 
the errors on the SFR - Mh relation are very small because the error bars only include the uncertainty in the parametrised stellar-to-halo 
mass relation. Different lines are colour-coded by redshift as indicated. The solid/dashed lines correspond to the ψ - Mh relation derived 
from Scenario A / B. The hatched regions indicate the Mh range where we are not able to derive reliable constraints on the ψ - Mh 
relation due to the increasingly limited m* range probed towards higher z. Right: Similar to the left panel but for star-forming galaxies 
only. Since the SFR distribution at a given stellar mass for star-forming galaxies is independent of environment, only Scenario A is 
plotted. 
Figure 11. Left: The average SFR as a function of halo mass at the present day. Mh is evolved to z = 0 using the halo accretion history 
from numerical simulations (Fakhouri et al. 2010). The solid/dashed lines correspond to the ψ - Mh relation derived from Scenario A 
/ B. Along any vertical line, we can trace the evolution of the SFR in the same halo. The dark grey / light grey / white region indicates 
the Mh range typically associated with cluster / group / field environments. For clarity, error bars are not shown here. Right: Similar to 
the left panel but for star-forming galaxies only. It is clear that the most actively star-forming galaxies at z ~ 2 reside in group-like 
environments and evolve into quiescent galaxies in groups at the present day. 



investigate star formation - black hole co-evolution. Finally, 
the impact of environment can be studied through galaxy 
group and cluster catalogues over a large redshift range and 
will be presented in a separate paper. 
APPENDIX A: THE CONDITIONAL STELLAR 
MASS FUNCTION 

Motivated by studies of galaxy groups (Yang et al. 2005), 
we can divide the CSMF into that of central and satellite 
galaxies, 



$$\Phi(m_*|M_h) = \Phi_{\rm cen}(m_*|M_h) + \Phi_{\rm sat}(m_*|M_h), \tag{A1}$$



where Φ_cen(m*|Mh) and Φ_sat(m*|Mh) specify the number 
of central and satellite galaxies as a function of m* at fixed 
Mh respectively. A log-normal distribution is used to model 
the CSMF of central galaxies, 

$$\Phi_{\rm cen}(m_*|M_h) = \frac{1}{\sqrt{2\pi}\,\ln 10\; m_*\, \sigma_c} \exp\left[-\frac{\log^2(m_*/m_c)}{2\sigma_c^2}\right], \tag{A2}$$

where m_c(Mh) is the mean stellar mass of a central galaxy 
in a halo of mass Mh and σ_c = 0.2 dex is the standard deviation. 
Following Moster et al. (2010), m_c(Mh) is parametrized 
as, 



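$$m_c(M_h) = 2 M_h \left(\frac{m_c}{M}\right)_0 \left[\left(\frac{M_h}{M_{1c}}\right)^{-\beta_c} + \left(\frac{M_h}{M_{1c}}\right)^{\gamma_c}\right]^{-1}, \tag{A3}$$

where (m_c/M)_0 is the overall normalisation, β_c and γ_c control the behaviour of 
m_c(Mh) at the low and high halo mass end respectively, 
and M_1c is the characteristic halo mass scale. A modified 
Schechter function is used to model the CSMF of satellites, 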



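$$\Phi_{\rm sat}(m_*|M_h) = \frac{\Phi^*}{m_s}\left(\frac{m_*}{m_s}\right)^{\alpha} \exp\left[-\left(\frac{m_*}{m_s}\right)^2\right], \tag{A4}$$

where α is the low-mass end slope, 

$$\alpha = \alpha_0 + \alpha_s \times \log\left(\frac{M_h}{M_\odot}\right), \tag{A5}$$

Φ* is the normalisation, 

*...u„, = m||   (A6)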



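and m_s is the characteristic stellar mass in the distribution 
of satellites, 

$$m_s(M_h) = 2 M_h \left(\frac{m_s}{M}\right)_0 \left[\left(\frac{M_h}{M_{1s}}\right)^{-\beta_s} + \left(\frac{M_h}{M_{1s}}\right)^{\gamma_s}\right]^{-1}, \tag{A7}$$

which has the same functional form as m_c(Mh). 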
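
The central-galaxy terms, Eqs (A2) and (A3), can be sketched as follows. The parameter values are illustrative assumptions chosen only to give a sensible curve (they are not the best-fit values of this work), and the helper that integrates the CSMF over m* is a hypothetical check that the log-normal is normalised to one central galaxy per halo.

```python
import math

LN10 = math.log(10.0)

# Illustrative parameter values (assumptions for this sketch only).
M1 = 10 ** 11.9   # characteristic halo mass in Msun
NORM = 0.03       # normalisation (m_c / M)_0
BETA = 1.1        # low-mass slope parameter
GAMMA = 0.6       # high-mass slope parameter
SIGMA_C = 0.2     # scatter in log10(m*) in dex, as quoted in the text


def mean_central_mass(m_h):
    """Eq. (A3): m_c = 2 Mh (m_c/M)_0 [(Mh/M1)^-beta + (Mh/M1)^gamma]^-1."""
    x = m_h / M1
    return 2.0 * m_h * NORM / (x ** -BETA + x ** GAMMA)


def csmf_central(m_star, m_h):
    """Eq. (A2): number of centrals per unit m* in a halo of mass Mh."""
    u = math.log10(m_star / mean_central_mass(m_h))
    prefactor = 1.0 / (math.sqrt(2.0 * math.pi) * LN10 * m_star * SIGMA_C)
    return prefactor * math.exp(-u * u / (2.0 * SIGMA_C ** 2))


def number_of_centrals(m_h, n=2000):
    """Integrate Eq. (A2) over m* on a log grid spanning +/- 2 dex around m_c."""
    log_m_c = math.log10(mean_central_mass(m_h))
    d_log = 4.0 / n
    total = 0.0
    for i in range(n):
        m_star = 10.0 ** (log_m_c - 2.0 + (i + 0.5) * d_log)
        # dm* = m* ln(10) dlog10(m*)
        total += csmf_central(m_star, m_h) * m_star * LN10 * d_log
    return total
```

In this form the stellar-to-halo mass ratio m_c/Mh falls off on either side of a halo mass close to M1, which is the behaviour described in the main text.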

Equipped with the CSMF, we can calculate the abun- 
dance and clustering of galaxies. For example, the SMF can 
be derived as follows, 



$$\Phi(m_*) = \int_0^\infty \Phi(m_*|M_h)\, n(M_h)\, dM_h, \tag{A8}$$

where n(Mh) is the halo mass function (HMF). In this paper, 
we use the HMF from Tinker et al. (2008). The galaxy power 
spectrum as a function of m* is 

$$P_{\rm gal}(k|m_*) = P_{\rm 1h}(k|m_*) + P_{\rm 2h}(k|m_*). \tag{A9}$$

The 1-halo term comes from galaxy pairs in the same halo, 

$$P_{\rm 1h}(k|m_*) = \frac{1}{\Phi(m_*)^2} \int n(M_h) \left[\Phi_{\rm sat}(m_*|M_h)^2\, u_g(k|M_h)^2 + 2\,\Phi_{\rm cen}(m_*|M_h)\,\Phi_{\rm sat}(m_*|M_h)\, u_g(k|M_h)\right] dM_h. \tag{A10}$$

Here u_g(k|Mh) is the normalised Fourier transform of the 
galaxy density distribution within a halo of mass Mh, assumed 
to be an NFW profile (Navarro, Frenk & White 1997) 
truncated at the virial radius. The 2-halo term comes from 
galaxy pairs in separate halos, 



$$P_{\rm 2h}(k|m_*) = P_{\rm lin}(k) \left[\int dM_h\, n(M_h)\, b(M_h)\, \frac{\Phi(m_*|M_h)}{\Phi(m_*)}\, u_g(k|M_h)\right]^2. \tag{A11}$$

Here P_lin(k) is the linear dark matter power spectrum and 
b(Mh) is the bias factor as a function of Mh (Tinker et al. 
2008). The projected correlation function at a given stellar 
mass is 



w p (r|m*) 



— P ga i(fcjm*)J (fcr), 



(A12) 



where J_0(x) = sin(x)/x is the zeroth-order Bessel function. 
There are a total of 11 parameters in the CSMF of the 
local Universe. We make use of Markov Chain Monte Carlo 
(MCMC) methods to derive the posterior PDF for all pa- 
rameters by fitting to the observed abundance and cluster- 
ing properties. Specifically, we use MCMC to minimise the 
reduced chi-squared 



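$$\chi^2 = \frac{1}{N_\Phi}\sum_{i=1}^{N_\Phi}\left[\frac{\Phi_{\rm CSMF}(m_i) - \Phi_{\rm obs}(m_i)}{\sigma_\Phi}\right]^2 + \frac{1}{N_s}\sum_{j=1}^{N_s}\frac{1}{N_r}\sum_{i=1}^{N_r}\left[\frac{w_{p,{\rm CSMF}} - w_{p,{\rm obs}}}{\sigma_{w_p}}\right]^2, \tag{A13}$$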



where N_Φ is the number of data points in the SMF, N_r 
is the number of data points in each projected correlation 
function and N_s is the total number of correlation functions. 
The best-fit value and the standard deviation for each parameter 
are listed in Table A1. The correlation matrix of the 
parameters in the CSMF of the local Universe is shown in 
Table A2. 
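
A fit statistic of the kind described above can be sketched as below. The sketch assumes that the abundance term is averaged over the SMF data points and that the clustering term is averaged first over the points in each correlation function and then over the correlation functions. That weighting is our reading of the reduced chi-squared and should be treated as an assumption, as should the function name and the mock numbers.

```python
def reduced_chi2(smf_model, smf_obs, smf_err, wp_model, wp_obs, wp_err):
    """Abundance term plus clustering term, each normalised by its data count.

    smf_* are sequences over stellar mass bins. wp_* are sequences of
    sequences: one projected correlation function per stellar mass sample.
    """
    n_phi = len(smf_obs)
    abundance = sum(
        ((m - o) / e) ** 2 for m, o, e in zip(smf_model, smf_obs, smf_err)
    ) / n_phi

    n_s = len(wp_obs)
    clustering = 0.0
    for model, obs, err in zip(wp_model, wp_obs, wp_err):
        n_r = len(obs)
        clustering += sum(
            ((m - o) / e) ** 2 for m, o, e in zip(model, obs, err)
        ) / n_r
    return abundance + clustering / n_s


# Mock numbers, purely for illustration.
chi2 = reduced_chi2(
    smf_model=[1.0, 2.0], smf_obs=[1.0, 1.0], smf_err=[1.0, 1.0],
    wp_model=[[1.0, 1.0], [2.0, 2.0]],
    wp_obs=[[0.0, 1.0], [2.0, 0.0]],
    wp_err=[[1.0, 1.0], [1.0, 1.0]],
)
```

An MCMC sampler would evaluate such a function at each proposed parameter set, with the model SMF and correlation functions recomputed from the CSMF.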

To add in the redshift evolution of the CSMF, we adopt 
the following parametrisation to describe the evolving m* - 
Mh relation, following Moster et al. (2010). The evolution 
in the characteristic halo mass scale is parameterised as, 
Table A2. The correlation matrix of the parameters describing the CSMF of the local Universe. 

             log M_1c  (m_c/M)_0        β_c        γ_c   log M_1s  (m_s/M)_0        β_s        γ_s   -log Φ_0        α_0        α_s
log M_1c         1.00      -0.66      -0.70       0.79      -0.54       0.56      -0.04      -0.44      -0.74      -0.15       0.11
(m_c/M)_0                   1.00       0.44      -0.11       0.43      -0.48       0.06       0.17       0.59       0.19      -0.16
β_c                                    1.00      -0.55       0.13      -0.15      -0.12       0.11       0.48      -0.11       0.15
γ_c                                               1.00      -0.38       0.37      -0.01      -0.43      -0.50      -0.06       0.03
log M_1s                                                     1.00      -0.99      -0.02       0.81       0.62       0.11      -0.07
(m_s/M)_0                                                               1.00      -0.04      -0.72      -0.62      -0.13       0.09
β_s                                                                                1.00      -0.18      -0.01       0.21      -0.21
γ_s                                                                                           1.00       0.58       0.05      -0.01
-log Φ_0                                                                                                 1.00       0.17      -0.09
α_0                                                                                                                 1.00      -0.99
α_s                                                                                                                            1.00



Table A1. Parameters in the CSMF of the local Universe. The 
first four parameters describe the distribution of central galaxies 
as a function of m* at fixed Mh, which is assumed to follow a 
log-normal distribution. The last seven parameters describe the 
distribution of satellites as a function of m* at fixed Mh, which 
is assumed to follow a modified Schechter function. 

parameter    best-fit   error   description
log M_1c     11.70      0.49    characteristic halo mass in the m*/Mh ratio
(m_c/M)_0    1.73       0.07    overall normalisation
β_c          1.16       0.06    power-law slope of m*/Mh at the low-mass end
γ_c          0.71       0.03    power-law slope of m*/Mh at the high-mass end
log M_1s     12.62      0.55    characteristic halo mass in the m*/Mh ratio
(m_s/M)_0    2.32       0.13    overall normalisation
β_s          2.38       0.33    power-law slope of m*/Mh at the low-mass end
γ_s          0.97       0.05    power-law slope of m*/Mh at the high-mass end
-log Φ_0     13.11      0.54    overall normalisation in the number of satellites
-α_0         0.28       0.11    power-law slope in N_sat
-α_s         0.06       0.01    power-law slope in N_sat at the low-mass end

Table A3. The redshift evolution parameters in the CSMF. 



$$\log M_1(z) = (1+z)^{\mu} \times \log M_1|_{z=0}. \tag{A14}$$



And the overall normalisation in the stellar-to-halo mass 
ratio is parameterised as, 



$$\left(\frac{m}{M}\right)_0(z) = (1+z)^{-\nu} \times \left(\frac{m}{M}\right)_0\bigg|_{z=0}. \tag{A15}$$



Finally, the power-law slopes at the high-mass and low-mass 
ends are parameterised as 



$$\gamma(z) = (1+z)^{\gamma_1} \times \gamma|_{z=0}, \tag{A16}$$

and 

$$\beta(z) = \beta|_{z=0} + \beta_1 \times z, \tag{A17}$$



respectively. We use the SMF in the high-z Universe to constrain 
the redshift evolution of the m* - Mh relation. The 
best-fit value and the standard deviation for each parameter 
are listed in Table A3. The correlation matrix of the 4 parameters 
used to describe the redshift evolution of the CSMF is 



parameter   best-fit   error
μ           0.028      0.010
ν           0.780      0.176
β_1         0.079      0.133
γ_1         -0.061     0.268



Table A4. The correlation matrix of the redshift evolution parameters 
in the CSMF. 

           μ       ν     β_1     γ_1
μ       1.00    0.57   -0.64    0.76
ν               1.00   -0.02    0.85
β_1                     1.00   -0.27
γ_1                             1.00



shown in Table A4. We have also tried to use 8 evolution 
parameters to allow different redshift evolution for the cen- 
tral and satellite population. However, the parameters are 
highly correlated and the uncertainties on these parameters 
are very large from MCMC chains. 
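
As a rough illustration of the redshift parametrisation, the sketch below evaluates Eqs (A14), (A16) and (A17) for the central-galaxy parameters. The default values are the central-galaxy entries as read from Tables A1 and A3, the function name is hypothetical, and the evolution of the normalisation, Eq. (A15), is left out of this sketch.

```python
def evolve_parameters(z, log_m1_0=11.70, mu=0.028,
                      beta_0=1.16, beta_1=0.079,
                      gamma_0=0.71, gamma_1=-0.061):
    """Return (log M_1, beta, gamma) at redshift z.

    Eq. (A14): log M_1(z) = (1 + z)^mu      * log M_1(z = 0)
    Eq. (A17): beta(z)    = beta(z = 0) + beta_1 * z
    Eq. (A16): gamma(z)   = (1 + z)^gamma_1 * gamma(z = 0)
    """
    log_m1 = (1.0 + z) ** mu * log_m1_0
    beta = beta_0 + beta_1 * z
    gamma = (1.0 + z) ** gamma_1 * gamma_0
    return log_m1, beta, gamma


# Evaluate the relation parameters today and at z = 2.
params_today = evolve_parameters(0.0)
params_z2 = evolve_parameters(2.0)
```

With a positive μ the characteristic halo mass grows with redshift, in line with the statement in the main text that the characteristic halo mass scale for central galaxies increases with increasing redshift.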



APPENDIX B: THE PROJECTED TWO-POINT 
CORRELATION FUNCTION 

The spatial two-point correlation function is often used to 
study galaxy clustering. It is defined as the probability of 
finding a galaxy pair at a given separation, in excess of that 
in a random Poisson distribution. We use the Landy & Sza- 
lay (1993) estimator 



$$\xi(r_p, \pi) = \frac{1}{RR}\left[DD\left(\frac{n_R}{n_D}\right)^2 - 2\,DR\left(\frac{n_R}{n_D}\right) + RR\right]. \tag{B1}$$



Here r_p and π are the separations perpendicular and parallel 
to the line of sight, n_D and n_R are the mean densities of the 
galaxy and random catalogues respectively. DD(r), DR(r) 
and RR(r) are numbers of weighted galaxy-galaxy pairs, 
galaxy-random pairs and random-random pairs at separa- 
tion r respectively. For volume-limited samples, the weight 
applied to each galaxy is 1. When generating random cata- 
logues for clustering calculation, the angular distribution of 
random galaxies is modulated by an angular mask, which 
Figure B1. The redshift-space correlation function ξ(r_p, π). The 
data from the first quadrant are repeated with reflection in both 
axes. The signal along the radial direction has been smoothed by 
a box filter of length 20 Mpc. 



is generated using the optical flags to take into account the 
selection effect. In Fig. B1 we plot the ξ(r_p, π) of galaxies 
with m* > 10^9.8 M⊙ in the redshift bin z1 = [0.2,0.5], averaged 
over COSMOS and EGS. The signal from the first 
quadrant is repeated with reflection in both axes. In the absence 
of peculiar velocity and redshift error, ξ(r_p, π) should 
be isotropic. The elongation of the signal along π leads to 
a reduction in the clustering amplitude. The problem can 
be overcome by integrating ξ(r_p, π) along π to derive the 
projected correlation function, 



$$w_p(r_p) = 2\int_0^\infty \xi(r_p, \pi)\, d\pi. \tag{B2}$$



Fig. B1 also indicates that integrating ξ(r_p, π) along π out to 
160 Mpc should capture all correlated signal. 
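
Eqs (B1) and (B2) can be sketched in a few lines. The pair counts are taken as given (counting pairs in bins of r_p and π is not shown), the integral in Eq. (B2) is replaced by a simple Riemann sum over equal-width π bins truncated at the largest bin, and all names and numbers are illustrative assumptions.

```python
def landy_szalay(dd, dr, rr, n_d, n_r):
    """Eq. (B1): xi = [DD (n_R/n_D)^2 - 2 DR (n_R/n_D) + RR] / RR."""
    ratio = n_r / n_d
    return (dd * ratio ** 2 - 2.0 * dr * ratio + rr) / rr


def projected_correlation(xi_along_pi, d_pi):
    """Eq. (B2) as a Riemann sum: w_p(r_p) = 2 * sum_i xi(r_p, pi_i) * d_pi."""
    return 2.0 * sum(xi_along_pi) * d_pi


# Mock pair counts in one (r_p, pi) bin, with a random catalogue twice as
# dense as the galaxy catalogue.
xi_unclustered = landy_szalay(dd=100.0, dr=200.0, rr=400.0, n_d=10.0, n_r=20.0)
xi_clustered = landy_szalay(dd=150.0, dr=200.0, rr=400.0, n_d=10.0, n_r=20.0)

# Mock xi values in three pi bins of width 10 Mpc at fixed r_p.
w_p = projected_correlation([0.5, 0.25, 0.0], d_pi=10.0)
```

The density ratio rescales the galaxy-galaxy and galaxy-random counts to the random-random counts, so an unclustered sample returns ξ = 0 regardless of how large the random catalogue is.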